10 research outputs found

    Joint Modeling and Registration of Cell Populations in Cohorts of High-Dimensional Flow Cytometric Data

    In systems biomedicine, an experimenter encounters different potential sources of variation in data such as individual samples, multiple experimental conditions, and multi-variable network-level responses. In multiparametric cytometry, which is often used for analyzing patient samples, such issues are critical. While computational methods can identify cell populations in individual samples, without the ability to automatically match them across samples, it is difficult to compare and characterize the populations in typical experiments, such as those responding to various stimulations or distinctive of particular patients or time-points, especially when there are many samples. Joint Clustering and Matching (JCM) is a multi-level framework for simultaneous modeling and registration of populations across a cohort. JCM models every population with a robust multivariate probability distribution. Simultaneously, JCM fits a random-effects model to construct an overall batch template -- used for registering populations across samples, and classifying new samples. By tackling systems-level variation, JCM supports practical biomedical applications involving large cohorts.

    RNA-SeQC: RNA-seq metrics for quality control and process optimization

    Summary: RNA-seq, the application of next-generation sequencing to RNA, provides transcriptome-wide characterization of cellular activity. Assessment of sequencing performance and library quality is critical to the interpretation of RNA-seq data, yet few tools exist to address this issue. We introduce RNA-SeQC, a program which provides key measures of data quality. These metrics include yield, alignment and duplication rates; GC bias, rRNA content, regions of alignment (exon, intron and intragenic), continuity of coverage, 3′/5′ bias and count of detectable transcripts, among others. The software provides multi-sample evaluation of library construction protocols, input materials and other experimental parameters. The modularity of the software enables pipeline integration and the routine monitoring of key measures of data quality such as the number of alignable reads, duplication rates and rRNA contamination. RNA-SeQC allows investigators to make informed decisions about sample inclusion in downstream analysis. In summary, RNA-SeQC provides quality control measures critical to experiment design, process optimization and downstream computational analysis.

    GenePattern flow cytometry suite

    Background: Traditional flow cytometry data analysis is largely based on interactive, time-consuming analysis of a series of two-dimensional representations of up to 20-dimensional data. Recent technological advances have increased the amount of data generated by the technology and outpaced the development of data analysis approaches. While there are advanced tools available, including many R/BioConductor packages, these are only accessible programmatically and therefore out of reach for most experimentalists. GenePattern is a powerful genomic analysis platform with over 200 tools for analysis of gene expression, proteomics, and other data. A web-based interface provides easy access to these tools and allows the creation of automated analysis pipelines enabling reproducible research. Results: In order to bring advanced flow cytometry data analysis tools to experimentalists without programmatic skills, we developed the GenePattern Flow Cytometry Suite. It contains 34 open source GenePattern flow cytometry modules covering methods from basic processing of Flow Cytometry Standard (FCS) files to advanced algorithms for automated identification of cell populations, normalization and quality assessment. Internally, these modules leverage functionality developed in R/BioConductor. Using the GenePattern web-based interface, they can be connected to build analytical pipelines. Conclusions: GenePattern Flow Cytometry Suite brings advanced flow cytometry data analysis capabilities to users with minimal computer skills. Functionality previously available only to skilled bioinformaticians is now easily accessible from a web browser.

    Distinct spatial characteristics of phospho-marker expression in samples from two classes of patients with different outcomes.

    (A) Heatplots provide insight into the distribution of phospho-proteomic expression of p-PLCg2 and p-STAT5 (panel 4) for (top 2 rows) and (bottom row) samples. The mound (high CD20 and BCL-2) populations are shown here. In contrast to the more symmetrically distributed, well-rounded mounds, the skewness in the mounds is clearly visible. (B) The stimulated mound (light brown histogram) of a sample is shown in contrast with the corresponding population prior to stimulation (greyish blue histogram). (C) The ability of the mound skew parameters () for 16 phospho-markers to distinguish samples across the and classes (green and pink labels respectively) is shown with a heatmap based on the corresponding posterior log-odds scores. The higher the score, the darker the corresponding entry in red/blue. Each marker name and its average posterior log-odds score over all samples are marked on the sides of the heatmap.

    Comprehensive molecular characterization of human colon and rectal cancer

    To characterize somatic alterations in colorectal carcinoma, we conducted a genome-scale analysis of 276 samples, analysing exome sequence, DNA copy number, promoter methylation and messenger RNA and microRNA expression. A subset of these samples (97) underwent low-depth-of-coverage whole-genome sequencing. In total, 16% of colorectal carcinomas were found to be hypermutated: three-quarters of these had the expected high microsatellite instability, usually with hypermethylation and MLH1 silencing, and one-quarter had somatic mismatch-repair gene and polymerase ε (POLE) mutations. Excluding the hypermutated cancers, colon and rectum cancers were found to have considerably similar patterns of genomic alteration. Twenty-four genes were significantly mutated, and in addition to the expected APC, TP53, SMAD4, PIK3CA and KRAS mutations, we found frequent mutations in ARID1A, SOX9 and FAM123B. Recurrent copy-number alterations include potentially drug-targetable amplifications of ERBB2 and newly discovered amplification of IGF2. Recurrent chromosomal translocations include the fusion of NAV2 and WNT pathway member TCF7L1. Integrative analyses suggest new markers for aggressive colorectal carcinoma and an important role for MYC-directed transcriptional activation and repression. Funding: National Institutes of Health (U.S.) (Grants U24CA143799, U24CA143835, U24CA143840, U24CA143843, U24CA143845, U24CA143848, U24CA143858, U24CA143866, U24CA143867, U24CA143882, U24CA143883, U24CA144025, U54HG003067, U54HG003079, U54HG003273)

    Comprehensive genomic characterization of squamous cell lung cancers

    Lung squamous cell carcinoma is a common type of lung cancer, causing approximately 400,000 deaths per year worldwide. Genomic alterations in squamous cell lung cancers have not been comprehensively characterized, and no molecularly targeted agents have been specifically developed for its treatment. As part of The Cancer Genome Atlas, here we profile 178 lung squamous cell carcinomas to provide a comprehensive landscape of genomic and epigenomic alterations. We show that the tumour type is characterized by complex genomic alterations, with a mean of 360 exonic mutations, 165 genomic rearrangements, and 323 segments of copy number alteration per tumour. We find statistically recurrent mutations in 11 genes, including mutation of TP53 in nearly all specimens. Previously unreported loss-of-function mutations are seen in the HLA-A class I major histocompatibility gene. Significantly altered pathways included NFE2L2 and KEAP1 in 34%, squamous differentiation genes in 44%, phosphatidylinositol-3-OH kinase pathway genes in 47%, and CDKN2A and RB1 in 72% of tumours. We identified a potential therapeutic target in most tumours, offering new avenues of investigation for the treatment of squamous cell lung cancers. Funding: National Institutes of Health (U.S.) (Grants U24 CA126561, U24 CA126551, U24 CA126554, U24 CA126543, U24 CA126546, U24 CA126563, U24 CA126544, U24 CA143845, U24 CA143858, U24 CA144025, U24 CA143882, U24 CA143866, U24 CA143867, U24 CA143848, U24 CA143840, U24 CA143835, U24 CA143799, U24 CA143883, U24 CA143843, U54 HG003067, U54 HG003079, U54 HG003273)

    Comprehensive molecular profiling of lung adenocarcinoma

    Adenocarcinoma of the lung is the leading cause of cancer death worldwide. Here we report molecular profiling of 230 resected lung adenocarcinomas using messenger RNA, microRNA and DNA sequencing integrated with copy number, methylation and proteomic analyses. High rates of somatic mutation were seen (mean 8.9 mutations per megabase). Eighteen genes were statistically significantly mutated, including RIT1 activating mutations and newly described loss-of-function MGA mutations which are mutually exclusive with focal MYC amplification. EGFR mutations were more frequent in female patients, whereas mutations in RBM10 were more common in males. Aberrations in NF1, MET, ERBB2 and RIT1 occurred in 13% of cases and were enriched in samples otherwise lacking an activated oncogene, suggesting a driver role for these events in certain tumours. DNA and mRNA sequence from the same tumour highlighted splicing alterations driven by somatic genomic changes, including exon 14 skipping in MET mRNA in 4% of cases. MAPK and PI(3)K pathway activity, when measured at the protein level, was explained by known mutations in only a fraction of cases, suggesting additional, unexplained mechanisms of pathway activation. These data establish a foundation for classification and further investigations of lung adenocarcinoma molecular pathogenesis.

    Comprehensive molecular profiling of lung adenocarcinoma: the cancer genome atlas research network

    Adenocarcinoma of the lung is the leading cause of cancer death worldwide. Here we report molecular profiling of 230 resected lung adenocarcinomas using messenger RNA, microRNA and DNA sequencing integrated with copy number, methylation and proteomic analyses. High rates of somatic mutation were seen (mean 8.9 mutations per megabase). Eighteen genes were statistically significantly mutated, including RIT1 activating mutations and newly described loss-of-function MGA mutations which are mutually exclusive with focal MYC amplification. EGFR mutations were more frequent in female patients, whereas mutations in RBM10 were more common in males. Aberrations in NF1, MET, ERBB2 and RIT1 occurred in 13% of cases and were enriched in samples otherwise lacking an activated oncogene, suggesting a driver role for these events in certain tumours. DNA and mRNA sequence from the same tumour highlighted splicing alterations driven by somatic genomic changes, including exon 14 skipping in MET mRNA in 4% of cases. MAPK and PI(3)K pathway activity, when measured at the protein level, was explained by known mutations in only a fraction of cases, suggesting additional, unexplained mechanisms of pathway activation. These data establish a foundation for classification and further investigations of lung adenocarcinoma molecular pathogenesis.